Introduction

In  this  progress  report,  we  present  evidence  that  we  have  developed  a  method  that 
should  be  adequate  for  identification  of  retrotransposition  events  involving  retroviral 
sequences.  The  following  information  from  the  previous  progress  report  outlines  our 
original  reasoning. 

Human endogenous retroviruses (HERVs), which make up an estimated 1% of the entire human
genome, are grouped into single and multiple copy families. Although most of them seem to be defective due
to  multiple  termination  codons  and  deletions,  some  of  them  are  full-length  proviruses  and 
transcriptionally  active.  HERV  transcription  may  play  a  role  during  normal  gene 
expression  (some  transcripts  are  very  abundant  in  placenta)  and  it  has  been  suggested  that 
this  role  could  be  a  protective  function  against  superinfection  by  related  exogenous 
retroviruses. 

However,  overexpression  of  HERVs  has  been  reported  in  various  cancer  cell 
lines,  such  as  teratocarcinoma  cell  lines  and  cell  lines  derived  from  testicular  and  lung 
tumors. Expression of HERVs was shown to be associated with the susceptibility of mouse
strains to lung cancer. In our laboratory, we have found variation in the
expression of different family members of HERVs in some colorectal cancer cell lines.
The implications of expression of HERV sequences in pathophysiological processes
remain to be elucidated.

There  is  substantial  evidence  that  HERVs  are  still  retrotransposing.  As  the 
retroviral  long  terminal  repeat  (LTR)  sequences  of  HERVs  contain  complex  regulatory 
elements  such  as  promoters,  enhancers,  transcription  initiation  sites  and  polyadenylation 
signals, their insertion in the vicinity of a host gene could cause a dramatic change in its
expression.  The  current  project  relies  on  the  hypothesis  that  HERVs  have  a  role  as 
mutagenic  agents  in  at  least  a  subset  of  breast  cancers.  This  research  focuses  on  two 
different families of HERVs: the HERV-H family, which is the most abundant (1000 copies in
addition to a similar number of solitary LTRs), and the HERV-K family, which is characterized
as the most “active” family. We are looking for differential HERV reverse transcriptase
activity  in  cancer  lines  in  response  to  growth  factor  and  hormone  treatments  as  well  as  in 
fresh  breast  cancer  tissues.  We  are  trying  to  determine  if  integration  of  cDNA  occurs  at 
an  enhanced  rate  in  cancer  cells  by  looking  for  new  integration  sites  of  HERV  sequences 
in  tumor  tissues.  Thus,  the  purpose  of  this  study  is  to  develop  ways  to  identify  new 
insertional  events  that  might  result  in  breast  cancer,  and  to  gather  other  evidence  that 
retroposition  of  endogenous  retroviruses  is  an  active  mechanism  of  cancer. 

Body 

Summary  of  earlier  results 

This  section  is  synopsized  from  the  previous  annual  report. 

Elevation  of  Reverse  Transcriptase  Activity 

Several instances of elevated levels of reverse transcriptase have
appeared in our screens of normal vs. tumor pairs, and in cancer-derived cell lines. These
results  were  discussed  in  the  previous  annual  report.  Briefly,  RNA  was  isolated  from  two 
normal  cell  lines,  eight  different  cancer  cell  lines,  and  from  normal  and  cancer  tissues 
originating  from  anonymous  patients.  Reverse  transcriptase  activity  was  measured 
through  the  abundance  of  retroviral  reverse  transcriptase  mRNAs  (HERV-H  family)  in 
the different cell lines and normal-tumor pairs. Figure 1 shows the results of RT-PCR
directed toward retroviral sequences for normal-tumor pairs and for cancer-derived cell
types. Two tumors (lanes 19, 21) as well as two tumor-derived cell lines (lanes 12, 15) show
elevated levels of transcript for reverse transcriptase. This behavior is consistent with
what would be expected for a retroposition mechanism.


Figure  1.  Retroviral  Reverse  Transcriptase  transcript  overexpression  in  cell  lines 
and  tumors,  analyzed  by  RT-PCR. 

Lanes 1-2, cancer cell lines; 3-8, normal-tumor pairs; 9-10, normal cell lines; 11-16, cancer cell lines; 17-20,
normal-tumor pairs; 21, unpaired tumor; 22-23, normal-tumor pair. Actin control and retroviral reverse
transcriptase bands are indicated.


PCR  of  flanking  genomic  sequences. 

PCR was performed using a primer specific for the LTRs of members of the
HERV-H family (RTVL-H2, RGH-1, RGH-2). Specific targeting of HERV LTRs was
confirmed by the use of nested primers generating products 10 base pairs shorter (Figure
2). The first PCR step involved a primer specific for the LTRs of members of the
HERV-K family (K10, HML6), under low stringency conditions. The second primer was
specific for the LTRs of members of the HERV-H family (RTVL-H2, RGH-1, RGH-2).
This PCR was run under high stringency conditions to ensure specific sampling of
HERV-H LTRs. The protocol was developed using cancer cell lines; then five matching
pairs of normal and tumor tissue DNAs were screened for differences between normal
and tumor tissues. Only one difference was detected in these fingerprints.

Figure 2 contains LTR-to-arbitrary products for 5 normal-tumor pairs, with the original
primer pair in the left panel and the nested pair in the right panel. Each sample is
represented in two lanes, where input RNA concentrations were titrated by a factor of
two. The effect observed (arrows point to several of the many examples) indicates that the
initial primer pair targets predominantly retroviral sequences in the genome, because
nesting the primers with respect to the original primers according to known, conserved
sequence causes a 10 base pair shift, consistent with the 10 base pair nesting from one
end. Sequence analysis of several of these bands confirmed that they usually
derive from an LTR on one end and arbitrary priming at the other. This experiment
raised the interesting possibility that genomic clone (BAC or YAC) arrays, which are
now commercially available, could be used to discover new retroviral insertion sites. We
have performed such a study since the last annual report, the results of which are
presented in the section entitled “Recent progress”.
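
The nesting check described above can be sketched in a few lines. This is only an illustration, not the analysis actually performed: the function name, the 1 bp tolerance, and the band sizes are all hypothetical. It assumes that a band primed from the LTR at one end should reappear about 10 base pairs shorter in the nested fingerprint.

```python
def bands_shifted_by_nesting(outer_bands, nested_bands, shift=10, tolerance=1):
    """Return outer-primer band sizes (bp) that reappear `shift` bp shorter
    in the nested-primer fingerprint, within `tolerance` bp."""
    confirmed = []
    for size in outer_bands:
        expected = size - shift
        if any(abs(n - expected) <= tolerance for n in nested_bands):
            confirmed.append(size)
    return confirmed


# Hypothetical band sizes in base pairs, for illustration only.
outer = [250, 410, 575, 820]
nested = [240, 400, 575, 811]
print(bands_shifted_by_nesting(outer, nested))
```

A band that does not shift (575 bp in this made-up example) would be a candidate for a product primed arbitrarily at both ends rather than from the LTR.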

Figure  2.  Nested  PCR  shows  that  retroviral  sequences  are  targeted. 

Normal-tumor pairs. Left: A primer directed outward from the LTR was used in conjunction with arbitrary
priming to amplify sequences flanking retroviral sequences in the genome. Right: nested primer secondary
amplification of the right hand fingerprint. Lanes 1, 3, 5, 7, 9 = tumor; lanes 2, 4, 6, 8, 10 = normal.




Figure  3.  Additional  example  of  LTR  to  Arbitrary  sampling  of  retroviral  flanking  sequences. 


This example also contains normal-tumor pairs, and shows the shift due to nesting, a
polymorphism (indicated by the arrow) and a possible retroposition event (boxed).


We have succeeded in targeting PCR primers to the HERV long terminal repeats and
reverse transcriptase in various cancer cell lines and tumor tissues. Retroviral activity
corresponding to different members of the two main HERV families was evident in
cell lines and in tissues. No new integration site has been discovered yet, but differences
observed in normal versus tumor fingerprints and DNA array hybridizations are under
further investigation (isolation and sequencing for identification).

Bubble-PCR,  RDA,  and  SSH 

As explained in the previous annual report, these originally proposed methods
were unsuccessful due to the apparent low frequency of novel retrovirus transposition and
problems intrinsic to PCR, namely, insufficient specificity of PCR when primers are not
exact matches for the targeted sequences. The background due to non-specific priming
was simply too high for these methods to fully resolve new transposition events. Thus,
we proposed to continue our work along the direction of using PAC arrays of genomic
clones probed with retroviral flanking sequences to screen cancers for novel insertion
events. As stated previously, “A new retroposition event will be accompanied by new,
unique flanking sequences which should show up as hybridization to a new BAC or YAC
clone”. In practice, we used PAC arrays rather than BAC or YAC arrays.

Recent  progress 

While we have not succeeded in demonstrating any novel insertional event of
retroviral sequences in breast cancer, with our modified aims we have fully developed a
method that should be able to detect novel insertional events, should they occur, in about
half of the genome. The strategy works as follows. Probe is made from genomic DNA of
normal-tumor  pairs  using  an  oligonucleotide  homologous  to  the  retroviral  LTR  (in  this 
case,  HERV).  Two  oligos  were  designed  (LTR2  and  LTR3)  homologous  to  the  two  ends 
of  the  HERV  LTR.  These  primers  are  directed  toward  the  most  highly  conserved 
sequences in the LTR. The initial round of synthesis begins with one of these primers and
extends from two positions due to the direct repeat nature of the LTR. The second strand
synthesis is supported by arbitrary priming, where the same primer used for the first
strand arbitrarily primes on the product from the first round. Thus, the final
PCR-amplifiable products have either LTR2 or LTR3 at both ends, depending on which was
used. LTR2 and LTR3 are alternatives and are not used in the same synthesis reaction.
Synthesis of the retroviral genome (i.e. between the LTRs) is inevitable. Second strand
synthesis by arbitrary priming within the retroviral genome need happen only
occasionally to guarantee that some retroviral sequence will be represented in the probe, and in
fact it is predominant. Sequences outside of the retrovirus, per se, are amplified when the
outward-facing  primer  is  accompanied  by  an  arbitrary  priming  event  toward  the 
retrovirus.  These  events  capture  unique  sequence,  and  if  a  new  retroviral  insertion  event 
occurs,  this  is  where  the  novel,  unique  flanking  sequence  will  be  captured.  Once  a  probe 
has  been  synthesized  in  the  manner  described,  hybridization  to  arrays  of  PAC  clones  can, 
in  principle,  reveal  new  insertion  events. 

The PAC arrays we have used contain inserts averaging 150 kb, such that the
entire mammalian genome can be contained in a minimal non-overlapping set of about
20,000. Since there are 10,000 HERVs in the genome, about half of these 20,000 contain
retrovirus-derived sequences. New insertion events into PAC-defined regions that already
contain a retrovirus will be invisible to this approach because the probes contain large
amounts of the retroviral genome internal to the LTRs. However, the other half of the
genome can be probed because it does not contain retroviral genomic sequence.
Improving  matters  somewhat  are  the  facts  that  many  of  the  older  pro-retroviruses  are 
diverged  to  the  point  that  they  are  not  detected  by  high  stringency  probing  and  that  in 
many  cases  only  a  single  LTR  is  found.  In  these  cases,  new  retroviral  insertion  events 
will  likely  have  higher  homology  to  preexisting  retroviral  sequences,  and  when  only  a 
single  LTR  is  present,  only  unique  flanking  sequence  will  be  detectable  by  hybridization. 
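
The coverage estimate above can be reproduced with simple arithmetic. The sketch below assumes a haploid genome of roughly 3 x 10^9 bp (a figure not stated in this report) and uses the 150 kb insert size and 10,000 HERV copies given in the text; the function names are illustrative. "About half" corresponds to the upper bound in which every HERV falls in a different PAC. If insertions were instead scattered at random, a Poisson approximation suggests that somewhat fewer PACs would be occupied, leaving more of the genome accessible.

```python
import math

GENOME_BP = 3.0e9       # assumed approximate haploid genome size (not from the report)
PAC_INSERT_BP = 150e3   # average PAC insert size given in the text
HERV_COPIES = 10_000    # HERV copy number used in the text


def pac_count(genome_bp=GENOME_BP, insert_bp=PAC_INSERT_BP):
    """Size of a minimal non-overlapping PAC set covering the genome."""
    return round(genome_bp / insert_bp)


def occupied_fraction_upper_bound(copies, pacs):
    """Fraction of PACs occupied if every copy lands in a different PAC."""
    return min(1.0, copies / pacs)


def occupied_fraction_random(copies, pacs):
    """Fraction of PACs with at least one copy under random (Poisson) placement."""
    return 1.0 - math.exp(-copies / pacs)


pacs = pac_count()
print(pacs)
print(occupied_fraction_upper_bound(HERV_COPIES, pacs))
print(round(occupied_fraction_random(HERV_COPIES, pacs), 3))
```

Under these assumptions the minimal set is 20,000 PACs, the upper bound on occupied PACs is one half, and random placement would occupy roughly 39% of them.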

In one experiment, we wanted to calibrate the individual-to-individual variation
which attends arbitrary priming. Figure 4 shows examples comparing two individuals
using LTR2. There were very few differences (< 0.05%), commensurate with individual
variation among humans. The differences probably arise mostly from differences in the
arbitrary priming step due to sequence variation between individuals, but can also arise
from differences in the positions of pro-retroviruses. This experiment shows that the
probe synthesis and hybridization methods are robust, and shows that we can detect a
number of PACs roughly equivalent to the estimated 10,000 predicted to harbor
pro-retroviruses.

Figure 4. Probes made with the same primers using DNA from two individuals.
a.  First  individual 


Next,  we  compared  the  same  normal  DNA  sample  using  either  LTR2  or  LTR3.  It  is 
predicted  that  the  major  source  of  difference  between  these  two  probes  is  due  to  the 
distribution  of  single  LTRs.  As  mentioned  above,  complete  pro-retroviruses  contain  the 
internal  genomic  sequences,  and  should  be  detected  by  probes  derived  from  either  LTR2 
or  LTR3.  Isolated  LTRs  could  also  contribute  probe  from  either  primer,  but  sequence 
variation  in  isolated  LTRs  can  result  in  detection  by  one  and  not  the  other,  as  can  the 
presence  or  absence  of  an  effective  arbitrary  priming  site.  Thus,  in  the  experiment  shown 
in  Figure  5,  many  differences  can  be  seen  between  hybridizations  derived  from  the  two 
oligos,  and  these  differences  should  reflect,  by  and  large,  PACs  that  contain  isolated 
LTRs.  This  difference  is  predicted  to  go  away  when  a  full  length  novel  retroviral 
insertion  event  occurs.  Because  of  the  continuous  genetic  drift  that  has  resulted  in 
divergence  of  pro-retroviral  insertions,  and  because  it  is  difficult  to  know  how  efficiently 
arbitrary  priming  captures  each  pro-retrovirus,  it  is  difficult  to  precisely  estimate  the 
number  of  pro-retroviruses  and  isolated  LTRs.  For  recent  novel  insertional  events, 
identical  retroviral  sequences  would  appear  in  more  than  one  PAC,  and  these  sequences 
would hybridize at high stringency. At high stringency, we observe signals from 10-20%
of all PACs. This is a further improvement in the ability of the method to detect new
insertion events over the >50% estimated above, because most of the endogenous
pro-retroviral sequences will be unique and only able to hybridize to themselves. Presumably,
those  retroviruses  still  capable  of  moving  will  have  retained  certain  essential  features, 
such  as  promoter  sequences,  but  the  majority  will  fail  in  high  stringency  hybridization, 
leaving  most  PACs  unhybridized.  This,  in  turn,  will  make  new  insertion  events  into  these 
PACs  easier  to  detect.  Finally,  if  differences  on  the  order  of  two-fold  can  be  detected 
using  better  technology,  such  as  glass-based  arrays,  then  the  majority  of  PACs  should 
become  accessible. 

In an experiment where normal and tumor DNAs were compared, no novel
retroviral insertion events were identified. This is not surprising because the proposed
mechanism, oncogenesis by retroviral insertion, is not expected to occur in all cancers.
However, the result indicates that the method should be very robust against false
positives. This experiment is shown in Figure 6.
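
The normal versus tumor comparison can be expressed as a simple screen over per-PAC hybridization signals. The following is a minimal sketch under stated assumptions, not the scoring procedure used in this work: the function name, the background threshold, the two-fold ratio, and the PAC identifiers and intensities are all hypothetical. A PAC is flagged when it is at or below background with the normal-derived probe but clearly above background with the tumor-derived probe.

```python
def candidate_insertions(normal, tumor, background, min_ratio=2.0):
    """Return PAC ids whose signal is at or below `background` with the
    normal-derived probe but at least `min_ratio` x background with the
    tumor-derived probe."""
    hits = []
    for pac_id, tumor_signal in tumor.items():
        normal_signal = normal.get(pac_id, 0.0)
        if normal_signal <= background and tumor_signal >= min_ratio * background:
            hits.append(pac_id)
    return sorted(hits)


# Hypothetical signal intensities (arbitrary units); PAC ids are made up.
normal = {"PAC_001": 950.0, "PAC_002": 40.0, "PAC_003": 35.0}
tumor = {"PAC_001": 1010.0, "PAC_002": 45.0, "PAC_003": 400.0}
print(candidate_insertions(normal, tumor, background=50.0))
```

PACs that already hybridize with the normal probe (PAC_001 in this made-up example) are never flagged, which mirrors the limitation that insertions into regions already containing retroviral sequence are invisible to the method.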

Figure 5a and b. Probes derived from (a) LTR2 and (b) LTR3.


Figure  6a.  Normal  DNA 


Close  inspection  of  these  figures  reveals  high  reproducibility  and  no  candidate 
retroinsertion  events. 

Conclusion

The  original  strategies  proposed  for  this  project  failed.  Those  strategies  were  based 
on  "bubble  PCR"  in  which  retroviruses  were  intended  to  be  specifically  amplified  from 
genomic DNA. This strategy fails because arbitrary priming events, while rare, are
favored by the enormous complexity of the human genome. Recognizing this problem,
we  devised  the  new  strategy  involving  PAC  arrays  as  described  above.  This  new  strategy 
can,  in  principle,  detect  more  than  half  of  all  new  retroviral  insertion  events.  The 
hybridizations  are  based  on  the  presence  in  the  probe  of  unique  flanking  sequence  around 
retroviruses.  The  retrovirus  sequence  itself  contributes  to  background  by  hybridizing  to 
all  PACs  that  contain  retrovirus  sequences,  and  thus,  only  those  insertion  events  in  PACs 
that  did  not  contain  retrovirus  sequences  in  the  individual  from  which  the  PAC  library 
was  made  are  accessible  to  the  method.  In  future  experiments,  however,  one  can  envision 
blocking  these  "background"  retroviral  sequences  with  pure  retroviral  driver,  such  that 
the  unique  flanking  sequence  around  novel  insertion  events  could  be  detected.  This 
would make the entire genome accessible to the method. Thus, we have partially
succeeded in our original goal of devising a method that will allow the detection of
novel retroviral insertions, currently for about half of the genome, and have discerned a
clear path toward completing development of the method.